# README 


## Overview

This replication package contains the code used for the analysis in Aryal, Campbell, Ciliberto, and Khmelnitskaya (2023), Review of Economics and Statistics. The data are organized and the models estimated in Stata; some figures and tables are generated using MATLAB, R, and LaTeX. The package can generate most of the figures and tables in the manuscript, including the main results. The remaining figures and tables are produced in LaTeX, R, or MATLAB; for those, we provide Stata code that generates the necessary data, which can then be exported to produce them.

## Data Availability and Provenance Statements

- [ ] This paper does not involve analysis of external data (i.e., no data are used or the only data are generated by the authors via simulation in their code).



### Statement about Rights

- [X] We certify that the author(s) of the manuscript have legitimate access to and permission to use the data used in this manuscript. 
- [ ] We certify that the author(s) of the manuscript have documented permission to redistribute/publish the data contained within this replication package. Appropriate permissions are documented in the [LICENSE.txt](LICENSE.txt) file.



### Summary of Availability

- [X] All data **are** publicly available.
- [ ] Some data **cannot be made** publicly available.
- [ ] **No data can be made** publicly available.
- [ ] Confidential data used in this paper and not provided as part of the public replication package will be preserved for ___ years after publication, in accordance with journal policies. 

### Details on each Data Source

The raw source files are not included in the replication package. The table below lists the main dataset and the four secondary data sources, together with the file for each that the programs use.


| Data.Name | Data.Files | Location | Provided |
| -- | -- | -- | -- |
| "Main Dataset" | maindataset.dta | 1_data/ | TRUE |
| "MSA Census" | rankmsamarket.dta | 1_data/ | TRUE |
| "Affiliation" | affiliations.dta | 1_data/ | TRUE |
| "Distance" | distancecoupons.dta | 1_data/ | TRUE |
| "Weather" | weekly_weather_data.dta | 1_data/ | TRUE |

where:

`Main Dataset` is the main dataset, constructed from the raw DB1B and T100 files from the US Department of Transportation. Its construction is described in Supplementary Material Appendix A.1 of the manuscript. DB1B refers to the U.S. Bureau of Transportation Statistics Passenger and Origin-Destination Survey (the so-called "DB1B" data). Information about the DB1B can be found on the [BTS website](https://www.transtats.bts.gov/Tables.asp). T100 is a survey administered by the Bureau of Transportation Statistics that reports total passengers. These data are available on the [BTS website](https://www.bts.dot.gov/browse-statistical-products-and-data/bts-publications/%E2%80%A2-data-bank-28is-t-100-and-t-100f).


`MSA Census` ranks all the MSAs by their population size using US Census data.  


`Affiliation` provides, by year, ownership information linking regional airlines to major airlines. The authors collected these data.


`Distance` gives distances between airports for all pairs of airports in the data.


`Weather` contains airport-level weather conditions from the Daily Global Historical Climatology Network, outlined in Menne et al. (2012) and described in Supplementary Appendix A.2.
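
Before running any program, it may help to confirm that the five files listed in the table are in place. The following is a minimal Stata sketch, not part of the package; it assumes the working directory is the root of the replication package and that the files sit directly under `1_data/`.

```stata
* Sketch (not part of the package): check that the five provided datasets exist.
* Assumes the working directory is the package root.
local files maindataset rankmsamarket affiliations distancecoupons weekly_weather_data

foreach f of local files {
    * confirm file sets a nonzero return code when the file is absent
    capture confirm file "1_data/`f'.dta"
    if _rc {
        display as error "Missing: 1_data/`f'.dta"
    }
    else {
        display as text "Found:   1_data/`f'.dta"
    }
}
```

Any file reported as missing should be restored before the do-files are run.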



### Software Requirements

- [X] The replication package contains one or more programs to install all dependencies and set up the necessary directory structure.

- Stata (code was last run with version 16.1)
  

### Memory, Runtime, Storage Requirements


Data preparation was done in Stata. The code should be run on a machine with at least 16 GB of RAM.


#### Summary

Approximate time needed to reproduce the analyses on a standard (2023) desktop machine:

- [ ] <10 minutes
- [ ] 10-60 minutes
- [ ] 1-2 hours
- [ ] 2-8 hours
- [ ] 8-24 hours
- [ ] 1-3 days
- [X] 3-14 days
- [ ] 14+ days
- [ ] Not feasible to run on a desktop machine, as described below.



### License for Code

The code is licensed under an MIT license. See [LICENSE.txt](LICENSE.txt) for details.


## Instructions to Replicators

Place the downloaded data in the folder `1_data/` and run the do-files in the folder `2_code/` to generate the auxiliary data, copies of which are currently in the folder `3_aux_output/`. Run the code in the following sequence:


### 1. Prep the main data 
- Run `2_code/MMC_Measure_All.do` using `maindataset.dta` to create `Regressiondataset.dta`


### 2. Add instrumental variables 
- Run `2_code/prepare_iv.do` to create `RegressiondatasetwithIVs.dta`


### 3. Regional HHI  
- Run `2_code/Regional_Concentration.do` to create `market_level_regional_hhi_data.dta`


### 4. Final dataset
- Run `2_code/Regressions_data.do` to create `finaldatasetforregression.dta`


### 5. Figures and Tables 
- Run `2_code/Summary_stats_figures.do` to create the results for the tables and figures. This file produces the data needed for Figures 1-3 and 5, Table 4, and Supplementary Appendix Table A.2.1. In the manuscript, Figures 1 and 2 are generated in MATLAB and Figures 3 and 5 in R, using the corresponding output from `Summary_stats_figures.do`. The R and MATLAB code is standard and is not provided.


### 6. Estimation Results 

- Run `2_code/Regressions.do` using `finaldatasetforregression.dta` (from Step 4) to estimate the parameters and conduct the inference needed for Table 5.
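
The six steps above can be chained in a single master do-file. The sketch below is an illustration rather than a file shipped with the package: the name `master.do` is hypothetical, it assumes Stata is launched from the package root, that all do-files live in `2_code/` (the location given in the program table, including for `Regressions.do`), and that each do-file resolves its own input and output paths.

```stata
* master.do (hypothetical name): run the replication steps in order.
* Assumptions: launched from the package root; all do-files are in 2_code/.
version 16.1
clear all
set more off

* Steps 1-5: data preparation, instruments, regional HHI, final dataset, summary output
local steps MMC_Measure_All prepare_iv Regional_Concentration Regressions_data Summary_stats_figures

foreach s of local steps {
    display as text "Running 2_code/`s'.do"
    do "2_code/`s'.do"
}

* Step 6: estimation for Table 5; adjust the path if your copy stores the file elsewhere
do "2_code/Regressions.do"
```

Because the full run is expected to take several days, running the steps one at a time and checking each output file before continuing may be more practical than a single unattended run.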




---
## List of Table/Figures and associated programs

The provided code reproduces:

| Figure/Table # | Program | Note |
|----------------|---------|------|
| Figures 1-3, 5 | `2_code/Summary_stats_figures.do` | Figures 1 and 2 generated in MATLAB; Figures 3 and 5 in R |
| Figure 4 | none | LaTeX (TikZ) figure in the manuscript |
| Tables 1-3 | none | LaTeX tables in the manuscript |
| Table 4 | `2_code/Summary_stats_figures.do` | |
| Table 5 | `2_code/Regressions.do` | |
| Table 6 | none | LaTeX table in the manuscript |
| Table A.2.1 | `2_code/Summary_stats_figures.do` | Supplementary Materials: Appendix A.2 |



---
## References

Aryal, Gaurab, Dennis J. Campbell, Federico Ciliberto, and Ekaterina Khmelnitskaya. 2023. "Common Subcontracting and Airline Prices," Review of Economics and Statistics.

Menne, M. J., I. Durre, R. S. Vose, B. E. Gleason, and T. G. Houston. 2012. 
“An Overview of the Global Historical Climatology Network-Daily Database,” Journal of
Atmospheric and Oceanic Technology, 29 (7), 897–910.
 


